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Abstract —In recent years, low-rank based tensor completion, 
which is a higher-order extension of matrix completion, has re¬ 
ceived considerable attention. However, the low-rank assumption 
is not sufficient for the recovery of visual data, such as color and 
3D images, where the ratio of missing data is extremely high. In 
this paper, we consider “smoothness” constraints as well as low- 
rank approximations, and propose an efficient algorithm for per¬ 
forming tensor completion that is particularly powerful regarding 
visual data. The proposed method admits significant advantages, 
owing to the integration of smooth PARAFAC decomposition for 
incomplete tensors and the efficient selection of models in order to 
minimize the tensor rank. Thus, our proposed method is termed 
as “smooth PARAFAC tensor completion (SPC).” In order to 
impose the smoothness constraints, we employ two strategies, 
total variation (SPC-TV) and quadratic variation (SPC-QV), 
and invoke the corresponding algorithms for model learning. 
Extensive experimental evaluations on both synthetic and real- 
world visual data illustrate the significant improvements of our 
method, in terms of both prediction performance and efficiency, 
compared with many state-of-the-art tensor completion methods. 

Index Terms —Tensor completion for images, smoothness, low- 
rank tensor approximation, CP model, PARAFAC model, total 
variation (TV), quadratic variation. 

1. Introduction 

Completion is a procedure that facilitates the estimation 
of the values of missing elements of array data, using only 
the available elements and structural properties of the data. 
Clearly, if there is no relationship between the missing ele¬ 
ments and the available elements, completion is not possible. 
However, real world data usually exhibits some correlations, 
latent factors, symmetry, continuity, or repetition, in which 
case completion is often possible. For example, when a vector 
consists of the sampled values of some continuous function 
that has several missing values, some interpolation methods 
can be used for completion, such as simple linear interpolation, 
spline interpolation, and polynomial interpolation. When a 
given matrix has several missing entries and the low-rank ma¬ 
trix factorization exists, then a low-rank structure can be used 
for completion, by approximating the given matrix using the 
low-rank factorization model. Such completion techniques are 
closely associated with computer vision, pattern recognition, 
and compressed sensing ifTvl . ll24ll . |[38]| . 

Techniques for vector/matrix completion have been compre¬ 
hensively researched, and many sophisticated methods exist. 
Furthermore, techniques for “tensor” completion have attracted 
attention in recent years, because of their potential applications 
and flexibility. A tensor is a multi-dimensional array, vectors 
and matrices can be considered as flrst-order and second- 
order tensors, respectively. For example, the data for a color- 


image is a third-order tensor, because it consists of three 
color shading images of red, green, and blue. Similarly, the 
data for a color video is a fourth-order tensor, because it 
consists of multiple frames of color images. There have been 
several papers that have attempted the completion color images 
using matrix completion ||30l, 1271 . ITTl . In lIZTl . the authors 
proposed to complete each color shading image separately, and 
subsequently concatenate them. However, such an approach 
ignores the natural multi-dimensional structure of tensors, and 
thus neglects some important information. In ISOl . 1371 . the 
authors applied a matrix completion technique to matrices 
that consist of set of similar patches selected by some patch 
matching algorithms. However, patch matching is generally 
time consuming, and it does not work well when there is an 
extremely high ratio of missing data. 

On the other hand, tensor completion methods have shown 
signiflcant progress in the completion of color images by 
exploiting the structural information of 3rd-order tensors. 
The state-of-the-art methods for tensor completion consist of 
approaches of two types. The first approach, called nuclear 
norm minimization, is based on the low-rank property 14^ . 
EH, EH. The nuclear norm of a matrix is defined as the 
sum of all singular values, and is a convex envelope for the 
rank of matrix ED. The nuclear norm of the tensor was 
first introduced and applied to tensor completion in ll42l . 
Subsequently, some improved algorithms were proposed in 
1^ . Il43]| . The minimization of tensor nuclear norms has also 
been applied in various contexts, such as error correction lf39l . 
1401 . medical imaging l44l . compression 1^ . and saliency 
detection l69l . The second approach involves the use of 
low-rank tensor decomposition techniques, which have been 
proposed in d, El, EH, EH, El, EH, |77l. Tensor 
factorization is a method for decomposing an Ath-order 
tensor into another Ath-order tensor of smaller size, termed 
as the “core tensor,” and A factor matrices. There are two 
factorization models: Tucker decomposition ED and polyadic 
decomposition 1^ . The polyadic decomposition is also called 
several different names such as CANDECOMP (canonical 
decomposition) d, PARAFAC (parallel factor) decomposition 
1^ . and CP (CANDECOMP/PARAFAC) decomposition l32l . 
In the present paper, we usually refer to this model as 
PARAEAC/polyadic decomposition (PD) model. A core tensor 
of PD takes the form of a super-diagonal tensor, and a core ten¬ 
sor of Tucker decomposition takes the form of a general tensor. 
Thus, PD constitutes a special case of Tucker decomposition. 
Similarly to the singular value decomposition (SVD) of matrix 
factorization, the minimum size of a core tensor corresponds to 
the rank of the original tensor. In (121, the authors consider the 
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low rank Tucker decomposition model for tensor completion 
and the minimization of the nuclear norms of individual factor 
matrices. In (771, the authors consider the low-rank PD model 
for tensor completion, by using a Bayesian framework to solve 
tuning parameter problems. Furthermore, the authors of (TTl 
and (m introduced a factor prior regularization, in order to 
yield improved results for visual data. 

In this paper, we investigate smoothness constraints for PD 
model based tensor completion. Smoothness is an important 
property that is embedded in many examples of real world 
data, such as natural images/videos, some spectral signals, 
and bio-medical data. For example, low-frequency (smooth) 
components of natural images computed by discrete cosine 
transform have higher magnitude than high-frequency com¬ 
ponents. This is a reason why the JPEG encoder is efficient 
and widely used in practice, and it indicates the usability of 
smoothness property for natural image processing. When we 
assume original images and additive noise are respectively 
smooth and non-smooth, a smoothness constraint helps to 
preserve a smooth original image and to remove non-smooth 
noise. In fact, matrix/tensor factorizations with smoothness 
constraints have many applications in which they are robust in 
the presence of noisy signals, such as blind source separation 
m, Gs, (751, video stracturing 1221 . visual parts extraction 
ItTI . genomic data analysis (bTll . and brain signal analysis 

|[I3. 

Total variation (TV) (531, which is defined as the /i-norm 
of the difference of neighbor elements, is often used to im¬ 
pose piece-wise smoothness constraints. Many methods have 
applied the TV approach in image restoration and denoising 
(47l, (261, (Ml, 0, da, ca. Furthermore, the TV constraint 
has been already applied to matrix completion in ( 181 , (271 . but 
until now it has not been investigated for tensors. Therefore, 
we apply the TV constraint to PD model based tensor com¬ 
pletion. Furthermore, we investigate an additional smoothness 
constraint, which is defined as the “/ 2 -norm” of the difference 
between neighbor elements, and is a stronger constraint than 
the TV constraint. Our concept of smoothness is not same 
as in other existing smoothness based completion methods 
o, (221, da, ( 73 . The key point of our approach is to 
assume that “latent component vectors” of factor matrices are 
smooth, and different levels of smoothness are enforced on 
different components adaptively. The details of the novelty of 
our model are discussed in Section IV-AI 

In addition, we propose a new strategy for a rank de¬ 
termination algorithm for the smooth PD model. A general 
strategy of the above mentioned “state-of-the-arts” algorithms 
involves setting the upper-bound of the rank for a tensor and 
optimizing the rank by using some procedures to remove 
redundant components, such as singular value shrinkage. How¬ 
ever, if we impose our smoothness constraint on a tensor, 
the number of required components increases, and we are 
not able to determine an upper-bound. Therefore, we propose 
a reverse strategy, estimating the rank of the tensor (i.e., 
the optimal number of components) by increasing the rank 
step by step, starting with a rank-one tensor. Our algorithm 
will stop when the smooth PD model fits the observed data 
sufficiently well. We carried out experiments involving tensor 


completion problems by using several difficult benchmarks 
of color images with different types of missing pixels (e.g., 
text masked, scratched images, and random pixels missing), 
incomplete MRI data, and multi-way structured facial data. 

The remainder of this paper is organized as follows. In 
Section JI-A| we explain the notations used in this paper. 
Section |II| reviews several existing matrix/tensor completion 
methods. In Section III we propose novel algorithms for 
tensor completion based on the smooth PD model. In Sec¬ 
tion we investigate the performance and applications of 
our algorithms, and compare them with some state-of-the-art 
methods. In Section |Vj we discuss several aspects of our work. 


Finally, we present our conclusions in Section VI 


A. Preliminaries and Notations 

A vector is denoted by a bold small letter a G A matrix 
is denoted by a bold capital letter A G A higher-order 

{N > 3) tensor is denoted by a bold calligraphic letter A G 
]^/ix/ 2 x - x7iv entry of a vector a G is denoted by 

a(i), and the (i, j)th entry of a matrix A G is denoted 

by A(i,jf). The (^ 1,^25 ...,iAr)th entry of an A^th-order tensor 
A is denoted by Ai-^i^...ij^, where in G {1, 2,..., In} and n G 
{1,2,..., A^}. The Frobenius norm of an A^th-order tensor is 

defined by ||A’||f := ■ 

A mode-k unfolding of a tensor A is denoted as AC(/c) ^ 
Pqj. example, a first mode unfolding of a third- 
order tensor AT G given by X(i)(ii, (23 — l )/2 + 

^ 2 ) = where in G {1, 2,..., In] and n G {1, 2, 3}. A 

modc-k multiplication between a tensor AT G j^Ax/sx - x/at 
and a matrix/vector A G is denoted hy y = A Xj. 

A^ G KAx---x/fc-ixi?x7fc+ix---x7iv^ entries given by 

yii---ik-irik+i---iN ~ ^ik ^and 

we have Y= A^X^k)^ 

II. Brief Review of Existing Methods 

In this section, we explain the methodology for ma¬ 
trix/tensor completion based on the low-rank assumption, and 
review the several state-of-the-art methods. 

Concerning matrix data, the most popular and basic ap¬ 
proach for matrix completion is the minimization of the matrix 
rank, subject to fitting the available elements: 

minimize rank ( X ) , s.t. XcL = Tq, ( 1 ) 

where X is a completed output matrix, T is an incomplete 
input matrix, ft denotes the indices of the available elements 
of T, the equation Xq = Tq means that X{i,j) = 
V(i, j) G O, and rank(X) denotes the rank of matrix 
X. Such a rank minimization approach has been applied in 
various research fields, such as machine learning (H, (5l and 
bioinformatics ( 6 OI . If an original matrix is structured as a 
low-rank matrix, then this rank minimization approach can 
be used to obtain an estimation of the ground truth matrix. 
However, rank is not convex function with respect to X, and 
rank minimization is generally NP-hard (25]| . Therefore, the 
nuclear-norm minimization method is widely used in practice: 

minimize ||X||*, s.t. Xq = Tq, (2) 
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where ||X||* = ^iCFi{X) denotes the nuclear norm, and 
(7i{X) denotes the ith largest singular value of the matrix X. 
In order to solve the optimization problem the following 
augmented Lagrange function can be applied: 

La{X,Y,W,I3) 

= \\X\U-{W,X-Y) + ^\\X-Y\\%, (3) 

where Yq = Tq is an auxiliary matrix, W is a. Lagrangian 
multiplier matrix, ^ is a. penalty (trade-off) parameter for the 
augmented term ||X — Y\\‘^p, and P is increased in each 
iteration step. This problem can be optimized by applying 
the alternating direction method of multipliers (ADMM) i). 
There are several existing papers that have applied the ADMM 
scheme to matrix or tensor completion problems ED, ED, 

lED, Ea. 

The smoothness property of data is often assumed for image 
completion. If we consider X to be a gray-scale image, and 
individual entries stand for the values of individual pixels, 
then the difference between neighbor pixels should typically 
be small. In 15^1 . the minimization of the total variation (TV) 
was employed as a smoothness constraint, which was defined 
by 

IIXIItv- := ^ ^/X,{i,jf+Xh{i,j)\ (4) 

where Xy{i,j) := X{i + 1, j) - X{i,j) and Xh{i,j) ;= 
X{i,j + 1) — X{i,j). Image completion via TV minimization 
was proposed in CD, and formulated in the form of the 
following optimization problem: 

minimize ||X||Ty, s.t. Xq^ = Tq. 

This problem is convex, and can be solved using gradient 
descent methods im. 

Furthermore, a combination of low-rank approximation and 
smoothness constraints was proposed in (271. Usually, natural 
images yield both of these structural features, and this combi¬ 
nation is quite efficient. The most straightforward approach for 
the combination of low-rank approximation and smoothness 
constraints is implemented by the following optimization 
problem: 

minMze | |X| |* + 7 I |X| Itv , s.tXQ = TQ, (5) 

where 7 is a trade-off parameter between nuclear norm mini¬ 
mization and TV minimization. In (23, a modified linear total 
variation was defined, as 

LTV(X) := J2{X,{i,jf+Xh{i,jf}, (6) 

and a smooth low-rank matrix completion method was pro¬ 
posed in the form of 

minmize ||X||* + 7 LTV(X), s.t. = Tq. (7) 

The optimization problem 0 was referred to as the linear 
total variation approximate regularized nuclear norm (LTVNN) 
minimization problem, and this problem can be solved using 
an ADMM like optimization scheme. This approach can be 


considered as the state-of-the-art method for low-rank image 
completion. However, LTVNN may be not useful for tensor 
completion. Concerning color images (third-order tensors), 
the authors separated an image into red, green, and blue 
frames, and applied the completion method to each color frame 
separately. However, when individual color frames are quite 
similar, such a separated method may not be effective. 

Tensor completion is a natural extension of matrix comple¬ 
tion with respect to the data structure, and it can use such 
structural information more effectively than matrix comple¬ 
tion. A basic method for tensor completion was proposed using 
simple low-rank tensor completion (SiLRTC) in (431 . and this 
was then formulated as the following constrained optimization 
problem: 


minimize 





s.t. Xq = Tq, 




||2 

\\f 


}■ 


( 8 ) 


where AT is a completed output tensor, 'T is an incomplete 
input tensor, is a low-rank matrix corresponding to the 
mode-i matricization form X 7 ), and ai and P are weight pa¬ 
rameters for individual cost functions. Essentially, this method 
attempts to minimize the tensor nuclear norm, defined as 
which is a generalization of the matrix nuclear 
norm. Instead of the minimization of <^i| |X( 7 11^, we 
minimize the nuclear norm of alternate parameters the 
mean squared error between alternate parameters and 

the mode-i unfolding of X. 

Note that when we strictly consider the minimization of 
J]] • (YiI |X( 7 11^, constraints X (7 = Y^'^^ should be added. The 
high accuracy low rank tensor completion (HaLRTC) method 
EH was proposed on the basis of this concept, by formulating 
the following optimization problem: 

N 

minimize aAlY^'^^W^, (9) 

s.t. X = y(^\ X^ = Tn, 


where is the tensor form of matrix Y^^\ The authors 
considered an augmented Lagrange function, which is an 
extension of ([ 3 , and can be minimized using the ADMM 
optimization scheme in similar manner. 

Recently, an alternative and very efficient completion 
method has been proposed, called simultaneous tensor de¬ 
composition and completion (STDC) lfT3 . This method is 
based on Tucker decomposition, and minimizes the nuclear 
norm of individual factor matrices by solving the following 
optimization problem: 


minimize 

s.t. 


N 

^ ai 1111 * + <5tr(#L#7 + 7 11 III, 

i=l 

( 10 ) 

X = gx{U^^^g„ Xn = Tn, 


where ^ is a core tensor, I/O) ^ factor matrix, G x 
:= eixi 1/(1) X 2 17(2) x 3 ...XivI 7 (^) is the Tucker 
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decomposition model of tensor, ^ := (8) • • • (8) 

is a unified factor matrix that consists of all factor matrices, 
( 8 ) is the Kronecker product, and L is a matrix designed 
by some prior information. The STDC method is based on 
the “low-rank Tucker decomposition” and “factor prior.” The 
minimization of can be interpreted as enforcing 

the similarity between individual components, which plays the 
role of a regularization, and provides some smoothed results 
for the visual data. Thus, STDC can be considered as a tensor 
extension of low-rank and smooth matrix completion. The 
authors considered an augmented Lagrange function, in order 
to solve in similar way. 

Finally, an additional efficient and promising method 
for tensor completion is given by fully Bayesian CAN- 
DEC OMP/PARAFAC tensor completion with mixture prior 
(FBCP-MP) 1771. The FBCP-MP method is based on the 
“probabilistic low-rank PD” and “mixture prior.” By using 
mixture prior, this probabilistic model enforces a kind of 
similarity between individual component vectors, similarly to 
STDC. Furthermore, FBCP-MP finds a PD with an appropriate 
tensor rank by using Bayesian inference, without need to tune 
or adjust any parameters. 

Hence, we can categorize the state-of-the art tensor com¬ 
pletion methods into three types: low-rank-nuclear-norm, low- 
rank-Tucker-decomposition, and low-rank-PD with optional 
smoothness constraints. The above mentioned papers on tensor 
completion did not discuss smoothness constraints in detail. 
However, smoothness is quite an important factor for visual 
data completion, which is obvious in matrix completion prob- 
lems (m, (50), EU- 


and the subsequent Tucker/tensor rank is then estimated by 
decreasing the number of components in each iteration. In this 
paper, we propose a completely different approach, which does 
not require the determination of the upper bound of the tensor 
rank, because in the proposed method we increase the number 
of components R gradually from 1 up to its optimal value. We 
call the new method “smooth PARAFAC tensor completion” 
(SPC). 

A. Fundamental Problem for the Fixed Rank SPC (FR-SPC) 

In this section, we consider the fixed rank version of 
SPC (FR-SPC). The optimization problem for FR-SPC is 
formulated as 

minimize — 3\\% (12) 

R 2 ^ 

r=l n=l 

S.t. Z = ^ O O • • • O , 

r=l 

Xn = Tn, = = 

Vre{l,...,i?},Vne{l,...,7V}, 

where AT is a completed output tensor in which the missing 
entries are filled using smooth PD approximation Z. Thus, the 
constraints Alq = 'Tq and JYq = Zq stand for 

y _ J (H5'^25 •••5at) ^ ^ 

- I otherwise ’ 


HI. Proposed New Method 

In this section, we propose a new algorithm for PARAFAC 
decomposition (PD) based tensor completion with smoothness 
constraints. First, the unconstrained PD model can be formu¬ 
lated as 

Z = O O . . . O (11) 

r=l 

(n) 

where o stands for outer product of vectors, Ur are feature 
vectors, called “components”, and Qr are scaling multipliers. 
By denoting factor matrices by ..., 

and a super-diagonal core tensor by Q, such that each super¬ 
diagonal element is Qrr...r = the PD model can be also 
denoted by ^ The minimum 

number of components R needed to satisfy equation O is 
called the “tensor rank” of Z. The exact PD with a minimal 
R is usually refereed to as canonical polyadic decomposition 
(CPD) or rank decomposition ll33]| . In general, when we 
impose the smoothness constraint on the feature vectors , 
the minimum number of components R increases, because the 
smoothness constraint decreases the fiexibility of the decom¬ 
position model. For this reason, it is difficult to determine 
the upper bound of the tensor rank of the original tensor 
when the smoothness constraint is imposed. In the STDC 
ca and FBCP-MP lITTI approaches, the upper bound of the 
Tucker/tensor rank is first determined as some large value. 


Next, p = [p^^\ p^‘^\ is a smoothness parameter 

vector, p e {1,2} is a parameter for selecting the types of 
smooth constraints, and the matrix G ^ 

smoothness constraint matrix, typically defined as 


/I -1 \ 

1 -1 

' i' -J 


(14) 


The first term, \\Al — Z\\p, of the objective function in 
represents the mean squared error (MSB) between the values 
of the observed entries 'Tq and the PD model Zq, because 
Afo = Tq and Af^ = Zq. Thus, the minimization of the 
first term of the objective function in 0 provides a PD of a 
given tensor 'T. 

The second term of the objective function in OH* is a 
penalty term, which assures smooth component vectors . 
Note that we have | 1(^) ~ + 

1)1^, and the minimization of this non-smoothness measure 
enforces the smoothness of individual feature vectors. When 
p = 1, the constraint term becomes the total variation 
(TV), and when p = 2 it becomes the quadratic variation 
(QV). In contrast to LTVNN (El, (271 . which imposes the 
smoothness constraint into the surface (output matrix), our 
approach imposes the smoothness constraint into the cause 
(latent components). The details of the difference between our 
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Algorithm 1 FR-SPC Algorithm 


1 

2 

3 

4 

5 

6 

7 

8 
9 

10 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 


■ o u 


w. 


input: T, R, p and p. 

initialize: {gr,{u^ 7 ^^ G randomly; 

Construct matrix by ( [T^ ; Vn G {1,A^} 

^ (Ef=l 9rU^r O ^ O • 

S = X — Yl^=l o O • 

repeat 

for r = 1,..., do 

yr ^ ^ QrU^r^ O O • • 

for n = 1,..., N do 

(n) . 7 -,(r,n) / 

^ argmm^^u^- ^ 

end for 


■ o u 


w. 


,(2)r 




■ P 


7«); 


£ <— “ QrU^^ o o • ■ • o ; 

0; 

end for 

until Change of ||£^|||^ after the next iteration is suffi¬ 
ciently small. 

Z ^ 9r'^r^^ o o • • • o 

output: X, Z. 


approach and other existing smoothing methods are discussed 
in Section EA] 

Note that the g‘^ terms are scaling factors for the smoothness 
constraint terms in OH*. This allows us to adaptively enforce 
the different levels of smoothness into different components. 
For image completion, the proposed method decomposes an 
image adaptively into a strong smooth background and a 
weaker smooth foreground. 


B. Derivation of the FR-SPC Algorithm 

We solve the optimization problem ^V2\ using the hierar¬ 
chical alternating least squares (HALS) approach m, ca. 
The HALS algorithm, included in block coordinate descent 
scheme, considers ‘feature-wise’ update which allows us to 
treat each unit-norm constraint separately. According to the 
HALS approach, we consider the minimization of the follow¬ 
ing local cost functions: 


minimize 

(1) m 


s.t. 


1 2 ^ 
n=l 


Zr = gr'^A^ 0 0 • • • 



II 

c 

II 

A 

[■^rjn, 

(15) 

||m (")||2 = l,Vn e 




where = X - Y,i^r ° “f ^ o • • • o and T[[^ = 

Tn - o uf'^ o ■ ■ ■ o u\^'’]q. The local-problem 

( p~5] ) only involves the rth components of PD with a fixed X. 
In order to solve ( p~5] ), we update gr, 


and reset yr sequentially as follows: 

^ diTgmmF^'^’'^\u), (16) 

Qr ^ argmin {g) , (17) 

9 

[yAn ^ [9rul^^ O O ... O (18) 

where Fp^\gr) = = ^\\yr — 9r'^r^^ o o 

••• + ^^n=l ^ 

and := {u G | ||r ^||2 = 1} is a sub-set of unit 
vectors. 

1) Update Rule for The problem ( p^ can be charac¬ 
terized as a unit-norm constrained optimization. When p = 2, 
objective function is a quadratic function with respect to u 
and it can be formulated as spherical constrained quadratic 
optimization l57l . When p = 1, the objective function is 
non-differentiable. Fortunately, many methods for solving this 
problem have been studied, such as sub-gradient methods 
m. To treat both cases (i.e., p = 1 and 2) equally, in 
this paper, we consider the gradient (or sub-gradient) based 
coefficient normalization update method ll48]| . 1211 . Although 
it does not guarantee global convergence generally, sticking 
into local minima of sub-optimization problem is not critical 
issue because the objective of sub-optimization is to decrease 
global cost function rather than to obtain strict solution of 
itself. As other options, several optimization schemes can be 
also applied such as Lagrange multiplier method ca, tangent 
gradient ll^ and optimization method on manifolds Q. In 
order the simplify the notation, we denote by and the k- 
th updates of and dFp^'^^ in the iterative algorithm, 

respectively, where dFp^'^\') is a gradient (or sub-gradient) 
of the objective function. Thus, the update U/e+i is given by 




Uj^ OLV]^ 




- 2aulvk 


a^v^Vk 


(19) 


where a a step size parameter, which should be tuned so 
that Fp^''^\uk+i) < Fp^'^\uk). This update rule can be 
considered as enforcing a sub-gradient descent on the hyper- 
spherical surface | |ix/c1 12 = | I'^/c+i 1 12 = 1- We iterate ( p^ until 
convergence is achieved. 

Next, we will demonstrate the gradient (or sub-gradient) of 
the objective function. The objective function Fp^'^\uk) can 
be simplified as 



Qrulv 


(n) 


-g^uluk, 


( 20 ) 


1 (n) /-k, (1)7^ (n—1)T 

where pf ^ := vec(J/^ Xi Ur X2 • • • x^_i Ur Xn-\-i 

^(n-i-i)T ,,, Xjv Thus, the gradient (or sub¬ 

gradient) of the objective function is given by 

dF^^’^\uk) = 

I fpML(^)^SGN(L(^^Uk) - gryi'"^ +g^uk (p = l) 

[ - grPr'^U grUk (P = ^) 

( 21 ) 
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Algorithm 2 Algorithm for Estimation of Optimal Number of 
Components R 

I: input: T, p, p, and SDR. 

3: FI — 1 \ 

4: repeat 

5: [Z, X] ^ FR-SPC(r, n, R,p, p); 

6: R^R + 1; 

1: until \\Zq, - TaWp < e 

8: output: Z, X. 


where the vector function SGN is defined by 


Algorithm 3 SPC Algorithm (accelerated version of Algo¬ 
rithm 1^ 

I: input: T, p, p, SDR, and u. 

2 : 

3: Xq, <- Tn; X^ average of Tn; 

4: Construct matrix by ( fT^ ; Vn € {1,A^} 

5: R i — Ij 

6: Initialize G randomly; 

7: gn ^ (AT, o o • • • o 

8: S = X — ^ o • • • o ; 

9- = 0’ 

10: t i — 0; 

11: pt t— ||£:|||'; 


SGN(a:) = [sgn(a;i), sgn(a; 2 ), .. 

1 {xj > 0) 

sgn(a;j) = { 0 {xj = 0) , 

-1 {xj < 0) 


sgn(a;j)]'^, 

(22) 

12: 

13: 

repeat 

Update {{ui^^}. 




(8th-16th lines); 


(23) 

14: 

^ \\E 1^; 


N 


,gr}^=i by the FR-SPC algorithm 


. This definition of SGN (a:) provides the most 
a:||i. 


for any x G 

unbiased sub-gradient of 

2) Update Rule for g^: Because the problem in ( p^ is an 
example of unconstrained quadratic optimization, the unique 
solution can be obtained analytically. The objective function 
can be simplified as 

Xfi'r - flr(Tr, ° O ■ ■ ■ O 




(24) 


n=l 


15 

16 

17 

18 

19 

20 
21 
22 

23 

24 

25 

26 


if 




e| — 

R i — i? -f 1; 
Initialize 

.(?) 


< u then 


G randomly; 


9r "G- (^, 

S ^ E — gRu\^ 

end if 
t i — t “I" 1; 

until /it < £ 


o u 


(2) 


■••on 


(AT) 

R 

■ O U 


); 

w. 


^ ^ Sf=i 9 '> 

output: X, Z 


( 1 ) ( 2 ) 

Ur O Ur O ' 


- O Ur 


W. 


Thus, the update rule is given by 

^ O O • • • O ui^^) 


(25) 


Finally, the FR-SPC optimization scheme can be summarized 
as shown in Algorithmic According to 1721 . several iterations 
of inner-loop of 10-13th lines in Algorithm may accelerate 
convergence speed for = 2, however it is not necessary for 

N>3 ED- 


C. Model Selection for the Number of Components R 

The key problem of FR-SPC is choosing the optimal number 
of components R. The PD model with too small an R is not 
able to fit the data, and the PD model with too large an R 
may result in over-fitting problems. 

In order to estimate an optimal value of R, we gradually 
increase R until we achieved the desired fit by formulating 
the following optimization problem: 

minimize R. (26) 

R 

S.t. W^Q ~ 'T^q\\‘f — ^5 

Z G s(i?,p, p), 

where §(i?,p, p) is defined as a set of all possible solution 
tensors via FR-SPC problem ( p3] ) with R, p, and p for any 
input tensors, in other words §(i?,p, p) is a tensor space 
spanned by R smooth components, and 5 is an error bound 


parameter. The criterion ( [26| ) allows us to finds a tensor Z 
based on the smooth PD model, with the minimum number 
of smooth components that guarantees a sufficient accuracy to 
fit the input tensor 'T. In order to guarantee that the signal to 
distortion ratio (SDR) is bounded within a specific threshold, 
we can define the error bound sls s = |||.. 

Note that mm 2 ;es(R,p,p){\\^n - TnWp} is a monotonically 
non-increasing function with respect to R. Algorithm is an 
implementation to solve ( |26| ), and it allows us to obtain a 
good estimation for R. However, this algorithm is quite time 
consuming. If the current value of R is too small to fit the 
PD model sufficiently well to the given data tensor during 
the iteration process, then we can stop the algorithm for the 
current R, increase as R v- i? + 1, and run the algorithm 
again for the new increased R. In this procedure, we propose 
switching to an increased R at an early stage of the iterations 
if the following condition is met: 


iMt -Mt+i| 

iMt+i - 


(27) 


where pt = H-^n — Tq,\\p, Z* denotes the PD model at the 
iteration step t, and > 0 is a stopping threshold (typically, 
R = 0.01). The left hand side of ( [Z7] ) is a measure of the 
convergence speed, so this condition means that when the 
convergence speed becomes substantially slow, we stop the 
iteration procedure for the current R. By incorporating this 
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Fig. I. Concept of optimization process in the SPC algorithm. An area 
R between arcs indicates the space of the tensor that is spanned by the 
components of R in the smooth PD model. An area that is hlled with waves 
indicates the space of the tensor satisfying the condition ||^n — 7^Q|||n <6. 
Concatenated arrows show the optimization process of the SPC algorithm, 
which gradually increases the number of components R from a rank-1 tensor 
to its end point, when the criterion is satisfied. 


Original Imcomplete TV (30.86 dB) QV (30.94 dB) 



Fig. 2. Iso-surface visualization of synthetic data and completion results 

using SPC-TV and SPC-QV. 


simple stopping criterion, we finally arrive at an improved 
and considerably accelerated algorithm for the automatic de¬ 
termination of the number of components R of FR-SPC. We 
call this method simply “smooth PARAFAC tensor completion 
with TV and QV minimization” (SPC-TV and SPC-QV), and 
it is summarized in Algorithm Fig. presents the concept 
of the optimization process of the SPC algorithm, and its 
details are provided in Section |V| Our algorithms have been 
implemented in MATLAB, and are available for readers at 
https://sites.google.com/site/yokotatsuya/home/software 

Because the best low-rank approximation problem for 
smooth PD model may be often ill-posed in the same way as 
for unconstrained PD model ESI, our algorithm allows some 
redundant terms in smooth PD model. Focusing on “good 
approximation” rather than “minimum rank decomposition”, 
although estimation of number of components for smooth PD 
model is difficult, its accuracy does not need to be so high if 
the reconstructed tensor is well approximated in practice. 

IV. Experimental Results 

In the experiments to be described here, we apply our 
proposed algorithms to one synthetic and several real-world 
visual datasets, to demonstrate the sensitivity of the parameters 
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qI-^^^^^^ 

0 50 100 150 200 250 300 

#iterations 


(a) TV 



Fig. 3. Convergence of mean squared error between and 7 q, and 
number of components R as a function of the number of iterations in 
the SPC algorithm using the synthetic data with missing voxels rate of 
80%. We performed simulations for various values of the stopping threshold 
y G {0.1, 0.01, 0.001, 0.0001}. Other parameters were set as SDR= 30 dB, 
p = [0.01,0.01,0.01] for TV smoothing, and p = [1.0,1.0,1.0] for QV 
smoothing. 


u and p, and the advantages of our proposed algorithms 
compared with the existing state-of-the-art methods LTVNN 
l27l . HaLRTC ED, STDC Ha, and FBCP-MP (Til- 


A. Convergence Properties Using a Synthetic Third-Order 
Tensor 

Fig. shows a visualization of the iso-surface of a synthetic 
3rd-order tensor, and its completion results by achieved by 
applying the SPC algorithm with TV and QV constraints. The 
synthetic tensor was constructed using a combination of four 
multi-dimensional Gaussian functions, and 80% of the voxels 
were randomly removed in its incomplete tensor. 

Fig. shows the plots of the mean of squared errors (MSB) 
between the tensors and 'Tq, and the number of estimated 
components R with respect to number of iterations of the 
SPC algorithm, with p= [0.01,0.01,0.01] for TV smoothing 
and p = [1.0,1.0,1.0] for QV smoothing. In addition, we 
set SDR= 30 dB and applied various switching threshold 
values of z/ G {0.1,0.01,0.001,0.0001}. We used the synthetic 
smooth third-order tensor with a random 80% of voxels 
missing. From Fig. we can see the monotonic convergence 
of MSE and the change in the number of components of R are 
closely related to the change of the MSE. Note that the number 
of components, R, is updated when the convergence speed of 
the MSE becomes slow. Eurthermore, the algorithm is seen to 
require a much larger number of iterations for smaller values 
of u. However, the results for the final number of components 
R were the same for different cases with o > 0.01. This 
means that for too small a value of o the convergence of 
the algorithm is relatively slow. According to our extensive 
experiments, z/ = 0.01 is an acceptable default value. 
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(b) Relative SSIM subtracted by LTVNN 


Fig. 7. Comparison of performances (PSNR and SSIM subtracted by baseline (LTVNN)) for all benchmark images and for various missing rates (from 60% 
to 95%) obtained using the proposed methods (SPC-TV and SPC-QV) and the state-of-the-art algorithms LTVNN, HaLRTC, STDC, and FBCP-MR 


B. Color Image Completion 

In this experiment, we applied our proposed and existing 
methods to 10 color-images (Fig. [^. The size of all color 
images is 256 x 256 x 3, and we generated incomplete data 
by deleting elements of the images randomly, with several 
different missing ratios G {60,70,80,90,95%}. For the im¬ 
ages “Giant” and “Wasabi,” all color elements of individual 
pixels (so called dead pixels) were deleted for all missing 
ratios. In order to evaluate the quality of completion, we used 
the peak signal-to-noise ratio (PSNR) and structural similarity 
index (SSIM) (661. PSNR is defined as 10 logio(255VMSE), 
and SSIM is defined by using the average and variance 
parameters of the local small blocks in an image. As an 
improved evaluation measure, SSIM is usually considered as 
an improvement on PSNR, because SSIM evaluates both the 
error and a kind of “local smoothness” of images for visual 
quality. In this experiment, we generally set SDR = 25 dB 
and V = 0.01. 

1) Convergence Behavior of SBC and FR-SPC Algorithms: 
Fig. shows the convergence curves for PSNR and SSIM 
for 10 runs with random initializations in the SPC and FR- 
SPC algorithms, using “Lena” with a random 95% of pixels 
missing, with p = 2 and p = [0.5, 0.5,0]. In both methods, 
the number of components is the same. Interestingly, FR-SPC 
converged to local minima that are larger than the convergence 
points for SPC. Furthermore, the PSNR and SSIM values 
achieved by SPC were significantly better than those for the 
FR-SPC algorithm. This implies that the “rank-increasing” 


approach results in an algorithm that converges to better local 
minima. 

2) Performances for Various Values of p: Fig. shows 
the PSNR, SSIM, and number of components R for different 
values of the smoothing parameter p. A benchmark image, 
“Lena,” with a random 80% of pixels missing, was used in this 
experiment. We applied the SPC-TV and SPC-QV algorithms, 
with various values of p = [p, p, 0]. We defined p := r/(l—r), 
and tested for a wide range of r G [0.05, 0.95]. In this exper¬ 
iment, we set the maximum values for R as 3000, because 
of the limitations of the memory. We found that larger values 
of p resulted in considerable increases in PSNR and SSIM 
in QV/TV smoothing. However, too large a p substantially 
decreases the performances in TV smoothing. Lurthermore, 
a larger p requires a larger number of components R to 
achieve a more precise fit, and also a larger computational 
cost. The SPC algorithm with TV smoothing required a larger 
number of components R than QV smoothing, and usually 
the performances, in terms of PSNR and SSIM, were inferior 
in comparison with QV smoothing. According to our results, 
QV smoothing performs, in general, better than TV smoothing 
for tensor completion. We discuss our interpretation of these 
results in Section |WB] 

3) Comparison with the State-of-the-Art Methods: Pig. 
shows the results of tensor completions using all 10 benchmark 
images with various random pixel missing ratios using the 
proposed method with TV and QV smoothing, and compares 
these with the performances of four state-of-the-art methods. 
































































YOKOTA et al.\ SMOOTH PARAFAC DECOMPOSITION FOR TENSOR COMPLETION (VERSION 3) 


9 


Incomplete lALM LTVNN HaLRTC STDC FBCP-MP Proposed (TV) Proposed (SV) 



Fig. 8. Simulation results for various incomplete images (top to bottom): text masked image, scratched image, random dead pixels missing (95%), and 
random voxels missing (95% and 99%). Incomplete images are shown in the first column, and reconstructed images are shown in the other columns. 


The parameters for the SPC algorithms were set as p = 
[0.05,0.05,0] for TV smoothing, and p = [1.0,1.0,0] for QV 
smoothing. The hyper-parameters for the other methods were 
optimally tuned manually. Interestingly, LTVNN outperformed 
HaLRTC for several images with respect to SSIM. However, 
it is difficult to say whether LTVNN outperformed HaLRTC 
with respect to PSNR. This implies that the smoothness 
constraint improves the structural similarity of images. Thus, 
we can conclude that the smoothed methods (i.e., LTVNN, 
STDC, FBCP-MP, and the proposed method) are evaluated 
better by SSIM than PSNR. The proposed method with TV 
smoothing was inferior to STDC and FBCP-MP in most cases 
(e.g., “Peppers” and “House”). However, the proposed method 
with QV smoothing considerably outperformed the all of the 
existing methods for all of our benchmarks. 

Fig. [8] presents several results for the completed images, 
using the existing methods and the proposed method. We 
prepared a text masked image of “Barbara,” a scratched image 
of “Peppers,” an incomplete image with random pixels missing 
(dead pixels) of “Giant,” and two incomplete images with 
random voxels missing of “Lena.” In addition, for the first 
time we tested an extreme case where the “99%” of pixels 
are missing, and such a large missing ratio may constitute 
an interesting challenge in this research field. There were 
rather small difference between the performances of most 
of the methods in the first and second rows. However, we 
observed significant differences for the images missing random 


pixels/voxels, especially when the ratio of missing pixels was 
high. In the reconstructed images in the last three rows for 
the non-smooth methods (i.e., lALM and HaLRTC), it was 
difficult to recognize the shapes of objects. In the reconstructed 
images in the second to last row, the smooth matrix completion 
method (i.e., LTVNN) was not able to reconstruct the woman’s 
facial features, but it is not difficult to recognize the woman’s 
facial features in the second to last row where STDC, FBCP- 
MP, and SPC were applied. For the last row with a 99% 
missing ratio, the shape of “Lena” could be recognized only 
for FBCP-MP and the proposed SPC-QV, but the performance 
of SPC-QV was better than for FBCP-MP. 

C. MRl Image Completion 

Fig. 1^ shows the results for several slices of MRI 3D- 
images (of size 109 x 91 x 91), with 60%-95% of voxels 
missing, obtained by HaLRTC, STDC, and SPC with TV and 
QV smoothing. Since the MRI image is smooth in all three 
dimensions (modes), the hyper-parameters of the SPC algo¬ 
rithm were set as p = [0.01, 0.01, 0.01] for TV smoothing, and 
p = [0.5, 0.5, 0.5] for QV smoothing. The hyper-parameters 
of other methods were tuned optimally and manually. Table |T| 
shows the performances (SDR) of the individual algorithms. 
From Fig. and Table |T| we can observe that the SPC 
algorithm succeeded in completing the incomplete MRI 3D- 
images, even with a 90%-95% missing ratio of voxels, and also 
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Airplane Baboon Barbara Facade House Lena Peppers Sailboat Giant Wasabi 



Fig. 4. Test images are 256 x 256 pixels color images, which consist of 
three layers of red, green, and blue color images. 



Fig. 5. Comparison of the SPC and FR-SPC algorithms. 



Fig. 6. Results of PSNR, SSIM, and the number of components of the SPC 
algorithms, using the image “Lena” with a missing ratio of 80%. We tested 
various values of p = [p, p, 0]^, where p is calculated by p = r/(l—r). The 
algorithms were relatively insensitive to the wide variation of the parameters. 


significantly outperformed the other methods, with respect to 
the SDR. 


D. 4th-order Tensor Completion Using a CMU Faces Dataset 


Next, we applied the three completion methods to a fa¬ 
cial image database, provided by Carnegie Mellon University 
(CMU), called “CMU faces” |[54ll . The CMU dataset consists 
of 6930 facial images of size 32 x 32 pixels. The 6930 
facial images represent 30 individual people, 11 poses, and 
21 kinds of illumination (i.e., 6930 = 30 x 11 x 21). Thus, 
the size of the CMU dataset is (30 x 11 x 21 x 1024). Since 
each facial image is generally smooth in this dataset, but the 
neighboring people, poses, and illuminations are not always 
similar, we enforced the smoothness constraint in only the 
final fourth dimension. Each facial image is described as a 
1024-dimensional vector, which is a vectorized from a 32 x 32- 
gray-scaled-image. Therefore, in this experiment we employed 
a smoothness constraint matrix that applies a combination of 
vertical and horizontal smoothness: 


= 



(28) 


where and are the vertical and horizontal smooth- 
ness matrices, respectively. The smoothness parameters of the 
SPC were set as p = [0,0,0,0.01] for TV smoothing and 


Incompleted (60% missing) 

Incompleted (80% missing) 

Incompleted (95% missing) 


HaLRTC (60% missing) 

HaLRTC (80% missing) 

HaLRTC (95% missing) 



#1 If II It 

STDC (60% missing) 
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Proposed (TV) (95% missing) 
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Proposed (QV) (80% missing) 
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Fig. 9. Results of tensor completion for MRI data by using HaLRTC, STDC, 
and the proposed SPC -TV and SPC -QV methods for various missing rates. 



Fig. 10. Results of tensor completion for the CMU face dataset by using 
HaLRTC, STDC, and the proposed SPC -TV and SPC -QV methods. Values 
of SDR [dB] are described in each subfigure. Considerable improvements in 
performance were achieved. 


p = [0,0, 0,0.1] for QV smoothing. Fig. 10 shows a section 
of the original and incomplete CMU faces, with 80% of faces 
missing, and its completed results obtained by the HaLRTC, 
STDC, and SPC algorithms. In this case, HaLRTC failed to 
complete the faces, while STDC provided several broken faces. 
However SPC achieved excellent results. 


V. Discussions 

A. Smooth PARAFAC Decomposition Model 

In Section |n| we introduced three methods for “low-rank” 
and “smooth” matrix/tensor completion: LTVNN 1271 , STDC 
lfT2l . and FBCP-MP (73. We will now expand on the differ¬ 
ences between these models and our proposed model. 
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TABLE I 

Signal to distortion ratio of MRI completion 


methods 

60% 

70% 

80% 

90% 

95% 

HaLRTC 

14.59 

12.40 

9.95 

6.93 

4.99 

STDC 

14.97 

13.78 

13.28 

12.81 

11.46 

Proposed (TV) 

18.13 

16.23 

14.16 

11.20 

8.36 

Proposed (QV) 

20.65 

19.09 

17.29 

14.73 

12.79 


First, our smooth tensor decomposition model is based on 
enforcing the “local similarity between neighbor elements in 
each component,” i.e., the vectors of the factor (component) 
matrices. In contrast, LTVNN da imposes the smoothness 
constraint on the “output matrix itself,” but not on the “com¬ 
ponents.” Furthermore, the smoothness of both STDC ca 
and FBCP-MP ITTTI is designed to enforce the similarity 
(or smoothness) between “individual components.” Thus our 
method is quite different from above methods lIZTll . [121, 
1771. In (sa, a generalized framework for tensor factorization 
with missing entries is proposed and smooth factor constraint 
is included in this framework. However, it is considering 
fixed-rank optimization problem, and does not propose model 
selection method for R. 

Second, our method is enforcing the different levels of 
smoothness for different components adaptively. Note that 
the smoothness constraint term in our objective function 
is multiplied by g‘^. Usually, the distribution of \gr\ of the 
PARAFAC decomposition is similar to the distribution of the 
“singular values” of the singular value decomposition, which 
may decrease exponentially from largest to smallest especially 
for smooth image data. Thus, our model can be interpreted 
as constructing a smooth image by building up the various 
levels of smooth components, beginning with the smoothest 
one. It should be emphasized that the adaptive smoothness in 
our model is a key-technique to perform large improvements 
from state-of-the-art methods for image completion. 

B. TV Smoothing and QV Smoothing 

In this paper, we consider two types of smoothness con¬ 
straints: TV and QV smoothing. In image recovery problems, 
such as denoising and restoration, it is generally considered 
that the TV constraint is better than QV. On the other hand, 
our result implies that the QV constraint performs better than 
TV for the image completion problem. This is a new and 
important discovery for the image completion research com¬ 
munity, because many studies still consider the TV constraint 
for completion problems (TSl, lITTIl . 

The advantages of QV for the completion problem can 
be summarized as follows. If we consider the minimization 
of the TV/QV term of a signal z = [0, NaN, 2, NaN, 0]^, 
the respective solutions are ztv = [ 0 , <^5 2 , 6 , 0 ]^, where 
a,b e [0,2] and zqv = [0,1, 2,1,0]^. Since the QV term 
has a strong convexity compared with the TV term, the 
uniqueness of zqv can be enforced. Furthermore, ztv has 
strong possibility of performing bumps in a component such 
as [0, 0, 2,0, 0]^, and including [0,2, 2, 2,0]^ in the solution 
set. In cases with high missing ratio, such as 90% or 95%, such 
bumps will be performed very often by TV minimization. 


C. Rank Increasing Approach 

In this section, we discuss the some advantages of rank 
increasing model selection which is employed in our method. 

First, rank-increasing approach is more suitable than rank- 
decreasing approach for our smooth PD model. If there is no 
smoothness constraint, then the weak upper-bound of tensor 
rank is 1^) for a general A^-th order tensor X G 

]^/ix/ 2 x - x7a^^ because any X can be factorized by only one 
dense matrix = AC(/c) and a unit matrix constructed as 
I = o o ... o 0 ... © G 

However, such a unit matrix is not suitable 
for a smooth constraint, and it is difficult to estimate the upper- 
bound of the tensor rank, because it depends on the imposed 
level of smoothness. In contrast, the rank-increasing approach 
can easily be applied to various problems in practice since it 
does not require prior information for the tensor rank. 

Second, the rank-increasing scheme is useful for the ini¬ 
tialization. In general, even if we know the exact minimum 
(canonical) tensor rank of the smooth PARAFAC/Tucker de¬ 
composition for a completion problem, many local minima 
may exist for the optimization problem. Considering several 
local optimal solutions for the problem, then the results of 
optimization methods will depend on the initialization. Fig. 
shows the concept of the optimization process for the SPC al¬ 
gorithm. The algorithm starts from a rank-one tensor {R= 1), 
the initialization of which has no critical meaning at this 
moment. The initialization of the rank-i? tensor factorization 
is given by the rank-(i? — 1 ) tensor factorization, which gives 
a better initialization for the completion problem, because this 
initialization arises from the lower-rank tensor space. When 
the algorithm is finally stopped, the SPC algorithm is able 
to find a good solution that is close to the lower-rank tensor 
space. 

In recent years, rank-increasing approach is a relatively hot 
topic in matrix and tensor factorization models il, Eol, ( 53 , 
(631, (TOl . The main statement in these papers is that the rank- 
increasing (greedy) approach is more efficient for large scale 
and ill-posed setting than nuclear-norm based scheme. Our 
method is an additional positive result in this topic. 

D. Convergence Property 

Unfortunately, unconstrained low-rank tensor approximation 
problem for higher ranks can be ill-posed or ill-conditioned 
in many cases csi. However, some other positive results 
have been also established in recent years: Considering the 
unconstrained problem for low-rank approximation without 
missing values, a rank-one approximation by the ALS al¬ 
gorithm achieves a local convergence (641 . (76l . PARAFAC 
decomposition with smooth spline components preserves the 
local convergence of the ALS algorithm (52l . The local con¬ 
vergence of the rank-one tensor approximation with penalized 
smoothing is proven in m As the latest result, rank-one 
ALS with regularization can obtain global convergence via 
Lojasiewicz inequality (62l . Please note that ALS and HALS 
are completely equivalent for rank-one tensor approximation 
R=l. 
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For the case of i? > 2, some results of block coordinate 
descent (BCD) scheme can be applied for HALS. In 16^ . 
local convergence of BCD scheme consisting of multiple 
convex sub-optimizations is obtained for non-negative tensor 
factorization model. In ED, a regularized PD based tensor 
completion algorithm is proposed and its local convergence 
is obtained. These results can be applied for our optimization 
problem when p = 2 except unit-norm constraints. 

Unfortunately, theoretical result about convergence of our 
algorithm is not obtained. The difficulty of analysis for our 
model may be caused by “unit-norm constraint.” This problem 
will be addressed in our future works. 

VI. Conclusions 

In this paper, we proposed a new low-rank smooth 
PARAFAC decomposition method for tensor completion prob¬ 
lems. Our approach and algorithms are quite different from 
existing methods. Instead of setting the upper bound of the 
expected rank of the tensor, our algorithm increases the 
number of components gradually until the optimal rank is 
achieved. We considered two types of smoothness constraint in 
our SPC method: total variation (TV) and quadratic variation 
(QV). Our method was shown to outperform the state-of-the- 
art algorithms, in particular HaLRTC, STDC, and FBCP-MP. 
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